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ABSTRACT 

The  gradient  method  for  solving  two-point  boundary- 
value  problems  is  discussed  and  a  modification  of  the 
gradient  direction  is  proposed.   An  algorithm  for 
efficiently  determining  the  step  size  is  also  derived. 
Analytic  and  numerical  examples  illustrating  the  efficien- 
cy of  the  method  are  presented. 
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1.   THE  PROBLEM  AND  NOTATION 

The  objective  of  controlling  a  physical  system  is  to 
make  the  system  function  in  the  most  desirable  manner.   If 
the  system  happens  to  be  a  production  machine  in  a  factory, 
the  application  of  the  best  control  will  yield  the  maximum 
machine  output  for  a  specified  expenditure  of  raw  materi- 
als, power,  and  production  time.   Perhaps  the  system  is  an 
aircraft  which  is  to  be  automatically  controlled  during 
landing.   In  this  situation  the  best  choice  of  control 
might  be  expected  to  minimize  deviations  from  a  specified 
flight  path. 

The  system,  or  plant  to  be  controlled,  is  assumed  to 
be  in  the  state  variable  form: 


x(t)  =  a(x(t),u(t),t) 


(1) 


where 


x  = 


X- 


X, 


n 


a  (n  x  1)  vector   u  = 


u- 


u. 


u 


m 


a  (m  x  1)  vector 


The  effectiveness  of  a  controlled  process  at  accom- 
plishing an  assigned  task  is  measured  by  a  functional 
called  the  performance  index  or  performance  measure,  J(u_). 
The  performance  measure  is  assumed  to  be  of  the  form: 


rfcf 


J(u)  =  h  [x(tf)ftf]  +  j   g(x(t),u(t),t)  dt 


(2) 


The  optimal   ontrQl  is  the  control  which  minimizes 
the  performance  m<  isure. 

There  are  two  general  approaches  that  can  be  followed 
when  computing  the  optimal  control.   The  first  of  these, 
dynamic  programming,  reduces  the  problem  to  one  of  making 
a  finite  number  of  optimal  decisions  starting  at  t  =  t- 
and  working  backwards  in  time.   Decisions  are  based  on  the 
"principle  oi     Lmality"  due  to  Bellman   •  <■  W  w)-io   }ias 
used  the  method  extensively.   While  efficient,  this  method 
suffers  from  the  requirement  of  large  amounts  of  computer 
storage. 

The  alternative  to  dynamic  programming  lies  in  varia- 
i  Lonal  calculus.   In  this  method,  the  performance  measure 
is  augmented  by  Lagrange  in      Lers  and  the  functiuna'J 


J  ( u )  =  d  ( u)   f 
a 


tt. 

p"  (t)  (a  (x(t)  ,  u  { t)  ,  t)  -x  { t) )  dt    (3) 


is  obtained.   The  superscript  T  denotes  the  transpose  of  a 

vector  or  matrix.  p   denotes  the  costate  (or  adjoint)  vec- 

and  is  of  dimension     1. 

When  the  state  equal  Li  ns  are  satisfied,  as  they  mrfst  be, 

J  -  J  since  the  integrand  of  (3)  vanishes, 
a 

Since  a  minimum  of  the  augmented  functional  is  sought, 

the  vari<       I  0   (denol    60  )  Is  obtained  and  the  fun- 

a  a 

damenta.l  theorem  of  vat  —his  is  empl  ,-  i  bo 

derive  necessary  conditions  wh        t  be  satisfied  by  a 
can  ii  so3     ti« 


It  is  convenient  at  this  point  to  introduce  a  func- 
tion, 3C,  the  Hamiltonian,  which  is  lefined  as: 


K(x(t).u(t),p(t),t)  ^  g(x(t),u(t),t)+pT(t)[a(x(t),u(t),t] 

(4) 

Using  this  notation,  the  necessary  conditions  for  op- 
timality  are:  [See^   ] 


x*(t)  =  7  K(&*(t),ii*(t),£*Ct),t) 

ir 


P*(t)  -  -  V     K(x*(t),u*(t),p*(t),t) 


0    =  Vu  K(x*(t),u*(t),p*(t),t) 


(5) 

(6) 
(7) 


and 


* 
h(x 

L  X 


vv  Mx "(tf),tf)  -p "(tf)_  6xp 


(8) 
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_K(x*(t f),u*(t f),P*(t f),t  )+  j-   h(x*(t f),t f)]«t  =  0 


The  star  superscript  denotes  the  optimal  solution  and  V  K 


denotes  the  matrix 


*x7 


ajc 


n 


6x„  is  the  final  variation  of  the  state  vector  and  6tc  is 
— F  t 


the  variation  of  the  final  time. 


The  Lues  Of  the  sta         assumed  to  be 

the  relati  ishij  ffina]  conditii  ns 

of  the  States  and  costat.es  are  given  by  equation  (8). 
Thus  the  problem  is  a  non  linear  two-point  boundary- value 
prob] em. 

Many  proposals  for  solving  this  problem  have  been 

made  to  date.   Among  these  methods  are  the  gradient  tech- 

( 4 )  { 5 )  f .  ( 7  \ 

nique  ,  vai  i     n  of  extremal?.-;    ,  and  the  meth- 

od of  quasilinearization  (generalized  Newton- Raphson  meth- 

(8) 
od)    .   All  oi"  these  methods  involve  an  initial  guess 

followed  by  a   Iterative  jSrocedu     signed  to  calculate 

th)   p1  Lmal  solution.   Variation  of  extremals  requires 

the  designer  to  select,  initial  values  of  the  costate  equa- 

ns  (p(t  H  and  Iterate  until  all  final  b  undarv  oondi- 
o 

>ns  are  satisfied.  Quasilinearization  demands  an  ini- 
tial seled  the  state  and  coState  trajectories  and 
I  h<   i  I  -  •       >  ■  cess  i  tie  difference  be- 

•  een  successive  I      tori es  becomes  sufficiently  small. 
Both  of  these  methods  may  penalize  the  designer  who  makes 
-  ess  by  non         ■        (ire  rex , 

■  rerg     ,  when         .,  '  ft.   fi       t]  discussi 

Oi  •'  ;-e  gradient  m  bl  tod  is  the  '  pi    •     '  Lon  2, 


2.   THE  GRADIENT  METHOD 

In  this  section  the  gradient  method  is  presented. 
The  difficulties  associated  with  the  method,  as  well  as 
its  attributes,  are  discussed. 
2. 1   Description  of  the  Gradient  Method 

Consider  the  system: 

x(t)  =  a(x(t),u(t)  ,t)  (1) 


and  the  performance  measure : 

t 
J(u)  =  h 


Lx(tf),tfJ  + 


f 
g(x(t),u(t),t)dt         (2) 
t 


which  is  to  be  minimized. 

The  necessary  condition  for  a  solution  to  be  optimal 

* 

is  that  the  variation  6j  (u  )  be  zero,  for  problems  with 

a, 

no  constraints  imposed  on  the  control.   Taking  the  varia- 
tion of  the  augmented  functional  J   and  equating  to  zero 

3. 

results  in  the  necessary  conditions  stated  in  equations 

(5), (6), (7),  and  (8). 

The  gradient  method  is  started  by  selecting  a  trial 
control  history  u    and  satisfying  equations  (5) , (6) ,  and 

(8) .   The  following  equation  then  represents  the  entire 

performance  measure  variation. 

tf 
6j(u(i))=  J   VuT5<(x(i)(t),u(i)(t),p(i)  (t),t)  6u(t)dt 
o 

(9) 
By  choosing  6u_(t)  correctly,  a  lower  value  of  the  perform- 


ance  measure  can  be  found.   The  process  is  then  repeated 
with  the  new  trial  control,  u^     .   This  control  is  selec 
ted  according  to  the  relation: 


u(i+1)(t)  =u(i)(t)  -  /5  VuK(x(i)(t),u(i)(t),p(i)(t),t) 

If  the  procedure  converges  on  the  nth  iteration,  the  result 
is  u(n) (t)  =  u*(t) . 

Since  the  equations  in  all  but  a  very  few  trivial 
problems  are  complicated  (and  usually  non  linear)  and  so- 
lutions in  closed  form  are  not  readily  available,  the  sets 
of  differential  equations  are  integrated  numerically  on  a 
digital  computer.   Then  x(t)  and  p(t)  are  not  known  for 
all  values  of  t,  but  are  available  only  at  a  discrete  set 

of  times,  t.,  where  t. ,,  =  t.  +  At,  with  At  being  the  in- 
1         l+l    1  r 

tegration  step  size.   By  similar  arguments,  the  choice  of 
control  histories  is  restricted  to  some  subset  of  U  (the 
set  of  all  admissible  control  histories)  which  is  defined 
at  the  set  of  times  (t.)  available  during  the  computation. 

A  practical  choice  is  to  make  u(t)  a  member  of  a  set, 
called  arbitrarily  Q,  of  all  piecewise  constant  functions. 
That  is  u(t)  =  u., t€[t. , t.+  At  J  . 

All  other  classes  of  functions  are  approximated  by  CI 

for  the  limiting  case  as  At  r*  0.   The  member  of  the  set  O, 

chosen  is  denoted  Q   ,  the  subscript  denoting  the  number  of 

subintervals  in  the  control  interval  [t  ,  t,.].   The  maximum 

o   f 

number  of  piecewise  constant  controls  in  the  control  in- 
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terval  is  determined  by  the  integration  step  size  and  is 


calculated  from  L  = 


t--t 

f  o 


6j(u    )  =  0  implies  either 


a-   v/(x{i)(t),u(i)(t),p(i)(t),t)HO,  te[to,tf] 


or 


b.   6u(t)  =  0. 


or 
t 


c. 


fr 


t 
o 


_VUT5C  (2£U)  (t),uU)  (t),p(i)  (t),t)  6u(t)]dt  =  0 

(10) 


Case  b  above  represents  the  trivial  solution  and  is 
of  no  value.   Of  the  non  trivial  choices,  case  a  above  is 
immediately  appealing  as  it  is  identical  to  equation  (7) 
of  the  set  of  necessary  conditions.   But,  since  u(t)  is  re- 
stricted to  the  set  PL  (the  price  of  numerical  computa- 
tion) ,  6J (u  )  can  only  equal  zero  if  the  state  and  costate 
vectors  also  happen  to  be  piecewise  constant  or  if  the  op- 
timal control  happens  to  be  piecewise  constant.   For  this 
case  there  is  no  clear  cut  strategy  to  improve  the  control 
history  and  therefore  proceed,  eventually,  to  the  optimal 
control.   Present  gradient  techniques  resort  to  satisfying 
the  requirements  dictated  by  equation  (7)  at  the  discrete 
points  where  all  required  information  is  available  and  use 
this  data  in  the  selection  of  a  "better"  control  history. 
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This  technique  is  satisfactory  initially,  but  yields  poor 
results  as  the  optimum  is  approached. 

Note  should  be  made  here  that  the  gradient  algorithm 
can  only  approach  the  true  optimum  as  the  integration  step 
size  At  approaches  zero.   This  is  caused  by  demanding  that 
u(t)  be  a   member  of  the  set  0. .   It  is  not  felt;  however, 
that  the  error  will  be  of  great  concern  in  problems  of 
engineering  interest. 
2 . 2   Motivation  for  Selection  of  the  Gradient  Method 

The  gradient  method,  while  suffering  from  difficulties 
yet  to  be  discussed,  has  some  appealing  attributes. 

The  first  favorable  characteristic  is  the  variable  to 
be  guessed,  which  in  this  method  is  the  control  history. 
The  designer  is  much  more  likely  to  have  insight  into  a 
suitable  control  history  than  either  the  remaining  unknown 
boundary  conditions  required  by  variation  of  extremals  or 
the  trajectories  required  by  quasilinearization.   For  a 
stable  system,  the  initial  guess  at  a  control  might  well 
be  to  "do  nothing"  and  guess  u/   (t)  =  0_,  te[t  ,  t,.]. 
Equally  important  is  the  fact  that  the  initial  guess  is 
not  usually  crucial  to  the  success  of  the  method. 

Another  feature  of  the  gradient  method  is  its  rela- 
tive ease  of  programming.   For  a  stable  system  all  inte- 
grations are  numerically  stable,  as  the  stable  state  equa- 
tions are  integrated  forward  in  time  while  the  unstable 
costate  equations  are  integrated  backwards  in  time.   All 
equations  are  relatively  easy  to  derive.   A  flow  diagram 
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A 


Guess  u^0'  (t) 


Forward  Integration  of 
State  Equations  to 

evaluate  performance 
measure  and  final 

boundary  condtions 


Backward  Integration  of 

Costate  Equations  to 
determine  the  gradient 


Is  the  norm  of  the  gradient  less 
than  the  termination  criterion? 


Vno 


u(i+1)(t)=u(i)(t)-/3x  VuK 


Figure  1. 
Flow  diagram  of  the  Gradient  Algorithm 
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of  the  algorithm  is  shown  in  Figure  1. 

2 . 3   Difficulties  Encountered  with  the  Gradient  Method 

As  mentioned  above,  the  gradient  method  is  plagued 
with  some  rather  poor  qualities,  apparently  the  price  of 
forgiveness  in  the  selection  of  the  initial  guess.   These 
problems  are  difficulties  involved  in  selecting  the  step 
size  (j3)  and  slow  convergence  in  the  vicinity  of  the 
optimum.   A  proposed  change  in  the  selection  of  a  gra- 
dient is  presented  in  Section  3. 
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•  3.   A  MODIFICATION  TO  THE  GRADIENT 
An  alternative  solution  to  the  dilemma  of  finding  a 
strategy  that  will  converge  to  the  optimum  choice  of  con- 
trol (in  the  set  CL)    is  to  restrict  6u(t)  to  the  set  CL 
also  and  base  a  strategy  on  case  c  (equation  (10))  of  sec- 
tion 2.1.   With  6u(t)  so  restricted  equation  (9)  can  be 
rewritten  as : 


6j  0>)  =  g^f  j    Vu5C  (ill)  (t)^(i)  (tK£(i)  (t)t)  dt 


j=0     j 

(ID 

and  the  equivalent  of  case  c  is  obtained  by  setting 
6j(u  )  =0.   Since  6u.  is  not  zero  (except  in  the  trivial 
case) ,  the  remaining  condition  for  optimality  can  be  im- 
mediately written  as: 

J        r   *      *      *     \    must 

VUK  (ji  (t),u  (t),p  (t),tj)dt   =   0  (12) 

t .    — 
3 

j   —  0  ,   ±  ,  Z  ,    .    .    .    ,  Li 

For  a  scalar  control  equation  (12)  may  be  expressed  as: 

,t.+At 
3        f   *      *      *      N    must 

VUK  (x    (t),u  (t),p  (t),tjdt   =   0  (13) 

j 

j=0,l,2/...,L 
Let 

t  .+At 

VUK  (x(t),u(t),p(t),t)  dt  ^  F.  (x(t),u(t),p(t)) 

j 
With  u  restricted  to  the  set  Q_  this  functional  may  be 
written  as  F .  =  F . (x,u, ,u0, . . . ,u . , . . . ,  u , ,p) .   From  this 

J  J  L       £  J  Li  — 
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an  expression  for  6u . ,  the  control  variation  in  the  jth 
interval  is  sought.   An  approximation  to  this  variation 
may  be  obtained  by  expanding  F .  about  the  reference  tra- 
jectory. 

F . (x+6x,un  +6u, ,u0  +  6u0, . . . ,u .  +  6u . , . . . , uT  +  6u_ ,p+6p) 
j        ±         ±      z         z  DD       J-iJ-i  —  — 

OF  .       of  . 
=  F.(x,u  u2,...,u    ..,u  £)+  ^-1  6x  +  ^-L  6Ul 

j  j  _         i 

of.,  of.  of.     OF. 

+  -5— L  6u„+ +  ^— !■  6u.+  •••  +  -5— ^6^+  -5— 1  6p 

0uo    2         ou.    ]         on        L   op   — 
^  3  Jj      — 

+  higher  order  terms 

The  terms  containing  first  partial  derivatives  com- 
prise the  first  variation  of  F.  and  will  be  denoted  6F . . 
Of  interest  is  the  value  of  6F .  caused  by  the  perturbation 
of  the  jth  control. 

Let   6u .  =0,  i  ^   j 

1  J 

Then 

OF.       OF.        OF. 

6F .  =  -5— 1  6x  +  -5— !  6u  .  +  -5— !  6p 
j    ox   —   ou .    3    op   £- 

OF .  OF. 

Assuming  that  5  ^  6x  and  5  ^  6p  are  small  enough  to  be 

neglected  leads  to  a  first  order  approximation  of  6F . . 
OF.  D 

6F  .  =  ■5—^-  6u  . 

D    ou.    3 
3 

Then 

F . (x+6x,u, ,u0, . . . , u .+6u . ,  .  .  .  ,uT  ,p+  6p  )  m 

3     —       —        L        Z  J  J  Li    —        ■*- 

OF-A 
F.  (X,U;,,U2, ,U., Ut'P)   +  "g~^  6uJ 

D  J  "  3 

For  F.  in  some  neighborhood  of  zero,  u.  must  lie  in  some 
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neighborhood  of  u  .  . 

Let 

F- (x+6X'ui'u2' . . . , u -+6u • ,  ,uL,p+6p) 

=  F  .  (x  ,u1#u2/ ,  u.  , »UL'P  )  =0 

Then 

F.  (x,u1,u2/ ,u  .  ,  .  .  .  /UL,p) 

+  -§^j—  F  •  (x,  ux ,  u2 , ,  u  . , ,  uL,  p)  fiu  .  =  0 

F  .  (x,  u,  ,  u~ , ,  u  .  , ,  u  ,  p) 

and   6u.  =  -  -j J — (14) 

3  -§^—  F.  (x,u1#u2#  .  .  ,,u.  ,  .  ...,u  fp) 

J  J 

If  the  control  terms  of  the  system  state  equations 
are  no  greater  than  quadratic  and  the  control  terms  of  the 
performance  measure  are  quadratic,  the  denominator  of  equa- 
tion (14)  is  a  constant.   For  this  case  a  nominal  value  of 

6u  .  =-F.(x,ulM..,uT,p)/At  may  be  employed. 
3     3  J-       *->  — 

In  the  notation  of  the  problem  this  modified  express- 
ion is  written  as : 

t \+At 

1   i  J 
6u.  =  -  ^  J    *uK(x(t),u(t),£(t),t)  at  (15) 

t . 

3 

This  expression  indicates  that  an  integral  mean  value  stra- 
tegy is  employed  to  determine  6u . . 

The  time  history,  or  collection  of  all  the  6u . ,  is 
taken  to  be  the  gradient  of  J(u  )  with  respect  to  the 
control,  in  the  sense  that  adjusting  the  controls  locally 
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in  the  direction  of  the  gradient  will  result  in  the  great- 
est change  in  the  performance  measure. 

Use  of  this  technique  leads  to  a  modified  control  gra- 
dient which  is  employed  with  the  standard  gradient  algo- 
rithm.  If  the  modification  is  valid,  the  resulting  gra- 
dient should  possess  certain  properties  and  eventually  lead 
to  the  optimum  choice  of  control.   A  general  discussion  of 
these  properties  is  presented  in  Section  4. 
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4.   A  TEST  OF  METHOD  EFFICIENCY 

The  analogy  between  the  gradient  method  and  a  hill 
climbing  procedure  is  presented.   Two  possible  climbing 
techniques  are  stated,  one  of  which  generates  orthogonal 
gradient  vectors.   A  comparison  of  problems  solved  by  the 
standard  and  improved  gradient  methods  is  made. 
4. 1   The  Gradient  Method  as  a  Hill  Climbing  Procedure 

The  gradient  method  for  the  solution  of  optimal  con- 
trol problem  is  often  compared  to  the  hypothetical  problem 
of  a   survey  party  attempting  to  climb  to  the  top  of  a 
mountain  (in  our  case  descend  to  the  floor  of  a  valley)  in  a 
dense  fog.   It  is  desirable  to  reach  the  unknown,  and  un- 
seen summit  in  a  minimum  amount  of  time. 

The  head  surveyor  is  faced  with  two  likely  strategies 
by  which  he  may  achieve  the  summito   He  can  take  a  local 
survey,  move  a  few  feet  in  the  direction  of  steepest  as- 
cent, and  then  take  another  local  survey  to  modify  his  di- 
rection.  He  may  be  reasonably  assured  that  by  use  of  this 
technique  the  shortest  distance  between  his  starting  point 
and  the  top  of  the  mountain  will  be  covered.   Unfortunate- 
ly, a  very  high  number  of  surveys  will  probably  be  neces- 
sary to  maintain  position  on  the  path  with  the  ensuing  haz- 
ard of  much  "wasted"  time. 

An  alternative  to  this  strategy  is  to  make  a  local 
survey  and  then  proceed  to  climb  in  the  indicated  direction 
until  no  further  ascent  is  possible.   At  this  time  another 
survey  is  made  and  the  climb  resumes  in  the  new  direction 


19 


determined.   This  approach  is  likely  to  deviate  from  the 
path  of  steepest  ascent  traversed  before  and  runs  the  risk 
of  running  into  plateaus  which  would  give  a  false  illusion 
of  being  at  the  summit.   Its  main  virtue  is  the  likelihood 
that  fewer  surveys  will  be  required  and,  hopefully,  the 
time  of  ascent  will  be  diminished. 

When  following  the  second  strategy,  it  is  noticed 
that  successive  gradients  will  always  be  orthogonal.   If 
this  was  not  true,  a  further  improvement  could  have  been 
achieved  by  continuing  to  climb  in  the  direction  of  the 
previous  gradient. 

The  problem  described  above  also  occurs  when  using 
the  gradient  technique  to  compute  an  optimum  control.   The 
optimal  control  is  the  one  which  lies  in  the  mathematical 
valley  defined  by  the  state  equations  of  the  system  and 
the  structure  of  the  selected  performance  measure.   Like 
the  surveyor,  the  control  designer  can  only  observe  local 
conditions  and  must  use  this  data  in  his  optimization  pro- 
cedure. 

If  a  strategy  similar  to  the  alternative  plan  proposed 
above  is  followed,  the  computation  of  the  optimal  control 
proceeds  as  follows; 

A  control  is  selected  and  the  gradient  is  calculated, 
establishing  a  direction  of  search.   A  number  of  trials 
are  made  to  determine  the  minimum  value  of  the  performance 
measure  that  can  be  found  by  moving  along  this  gradient. 
At  the  minimum  value  point  a  new  gradient  is  computed  and 


20 


the  process  is  repeated.   When  no  improvement  can  be  at- 
tained in  any  direction,  the  process  is  assumed  to  have 
converged  to  a  local  minimum. 

Using  this  type  of  procedure,  if  an  accurate  search 
for  a  minimum  is  performed,  and  if  the  gradient  is  correc- 
tly evaluated,  successive  gradients  will  be  orthogonal. 
4. 2   Comparison  of  Standard  and  Modified  Gradient  Results 

Problems  computed  using  the  standard  gradient  tech- 
niques approximate  the  orthogonality  property  in  the  ini- 
tial stages,  but  as  the  optimum  is  approached  any  approx- 
imation to  orthogonality  completely  breaks  down.   The  pro- 
cess finally  "converges"  with  an  indication  that,  while 
the  norm  of  the  gradient  is  sufficiently  large,  no  further 
improvement  can  be  made  by  any  move  in  the  indicated  gra- 
dient direction.   This  leads  to  the  conclusion  that  the 
true  gradient  has  not  been  computed  from  the  standard  tech- 
niques. 

When  the  modified  gradient  is  used  with  the  same 
search  procedure,  successive  gradients  maintain  the  pro- 
perty of  orthogonality  throughout  the  problem  computation 
and  improvement  is  always  achieved  by  moving  along  the 
indicated  gradient.   Convergence  is  smoother  than  that  of 
the  standard  method  and  is  usually  achieved  by  the  re- 
quirement that  the  norm  of  the  gradient  be  less  than  some 
small  specified  constant.   The  conclusion  is  that  the 
modification  leads  to  the  computation  of  an  improved  gra- 
dient of  J  with  respect  to  the  control  u(t) . 


21 


5.   TWO  SINGLE  VARIABLE  SEARCH  PROCEDURES 
Computation  of  the  gradient  establishes  a  direction 
in  which  a  search  for  a  minimum  will  prove  fruitful.   The 
size  of  the  step  to  be  made  in  this  direction  remains  to 
be  determined.   Two  single  variable  search  procedures  are 
presented  by  which  this  step  size  may  be  calculated. 

5. 1   The  Golden  Section  Search 

(9) 

The  Golden  Section  search  described  by  Wilde     pro- 
vides an  efficient  means  for  finding  the  minimum  of  an 
unknown,  but  assumed  unimodal,  function,  once  the  minimum 
is  known  to  lie  in  a  specified  closed  interval.   The  prin- 
cipal properties  of  unimodality  of  interest  are  that  for  a 
unimodal  function,  f(x),  only  a  single  minimum,  f(x  ), 
exists  in  a  closed  interval,  I,  and  for  two  points,  a  and 
b,  both  in  the  interval,  and  on  the  same  side  of  the  mini- 
mum,  f  (a)  <  f  (b)   if   |  x   -a  |  <  |  x   -b  |  . 

The  method  is  described  briefly  below. 

Given:  an  interval  of  length  Ln  containing  a  single  minimum 
of   f(x) 

Find  :  an  interval  of  length  L-.  <  Ln  such  that  the  minimum 
of   f(x)  is  contained  in  L-. 

A  typical  problem  is  shown  in   Figure  2.    Let  the  lower 
bound   of  the  interval  be  at   a   and  the  upper  bound  at 
b.    Then  the  interval  length  Ln  is  given  by   L„  =  b-a. 
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Figure    2. 

Typics.1    intervals    in   the    Golden 
Section  search   algr.rJLth  i 
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Place  an  experiment  in  the  interval  at  point  c  and  let 

L      L, 
L,  =  c-a„   Point  c  is  chosen  such  that  —■  =  7 = —  .   One 

o     2 
solution  of  this  equation  is  - —  =     ■  =  1.68034.   Call 

Ll    75^1 
this  number  T.   Then  c  =  a+L,  =  a+Ln/T.   Another  experi- 
ment is  placed  symmetrically  in  the  interval  at  a  point 
c«  =  b-LQ/T. 

The  following  decision  is  made  on  the  basis  of  the  experi- 
ments at  points  c  and  c ' : 

* 
if  f(c')  <  f(c)  then  a  <  x   <  c  and  L,  =  c-a 

* 
if  f(c')  =  f(c)  then  c'<  x   <  c  and  L,  =  c-c' 

if  f(c')  >  f(c)  then  c'<  x   <  c  and  L,  =  b-c ' 

Thus  an  interval  L,  <  Ln  containing  the  minimum  has  been 
generated  due  to  the  unimodality  of  f (x)  and  the  placement 
of  the  experiments. 

The  advantage  of  the  method  becomes  apparent  when 
further  redcutions  of  the  interval  are  attempted,  the  ob- 
jective being  to  generate  a  sequence  of  decreasing  inter- 
val lengths  such  that  L   <  y,    where  y   denotes  the  stopping 
criterion  of  the  search. 

To  generate  L^,  experiments  are  placed  in  the  reduced 
interval  L,  in  the  same  manner  as  the  experiments  were 
placed  in  the  original  interval.   Assume  that  a  <  x   <  c 
and  thus  L.,  =  c-a.   Two  new  experiments  are  to  be  placed 
in  this  new  interval  at  points  d  and  d1. 

Then  d  =  a+L,/T   and   d'  =  c-L,/T 

2 

But  c'  =  b-L0/T  and,  noting  that  L,  =  LQ/T#  d  =  a+L0/T  • 
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An  expression  for  c ' -d  may  be  written  as: 

c ' -d  =  L  (1-^--^)  =0   .'.  c  '  =d 

T 

Since  one  experiment  already  exists  in  the  reduced  inter- 
val L,  ,  only  one  additional  experiment  must  be  performed 
to  further  reduce  the  interval. 

To  apply  this  search  method  to  the  gradient  algorithm 
the  interval  of  the  search  must  first  be  defined.   It  is 
already  known  that  moving  in  the  gradient  direction  will 
guarantee  at  least  some  improvement  in  the  performance 
measure.   A  crude  search  is  performed  until  the  perform- 
ance measure  is  greater  than  the  value  found  at  the  origin 
of  the  search.   This  is  achieved  by  initially  taking  a 
unit  step  in  the  gradient  direction  and  doubling  subse- 
quent step  sizes  until  this  condition  is  met.   The  result- 
ant closed  interval  must  then  contain  the  minimum  and  can 
be  reduced  by  means  of  the  golden  section  search  procedure. 
Table  1  shows  the  number  of  experiments  required  to  reduce 
an  interval  of  unit  length  to  a  desired  length  L  . 

Application  of  the  golden  section  search  and  modified 
gradient  technique  to  the  problems  run  to  date  has  result- 
ed in  convergence  in  a  very  few  iterations  and  established 
the  validity  of  the  modified  gradient  method.   Many  experi- 
ments are  required  per  iteration,  however;   and  so  the 
total  number  of  forward  integrations  that  must  be  perform- 
ed is  still  rather  high.   A  method  to  improve  this  short- 
coming is  discussed  next. 
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5. 2   A  Quadratic  Approximation  Search  Technique 

Since  the  search  for  a  minimum  in  the  gradient  direc- 
tion involves  the  single  variable  ft,    J  can  be  written  as 
some  unknown  function  of  this  variable  J(£).   From  pre- 
vious  calculations  J(0)  is  known  and  -rg  J(0)  is  known  to 
be  less  than  or  equal  to  zero,  with  equality  applying  at 
the  optimum  solution  of  the  main  problem. 

An  alternative  solution  to  performing  a  golden  section 

search  is  to,  rather  boldly,  assume  that  J(£)  is  a  quadra- 

2 

tic  of  the  form  J(/3)  =  aj3  +bj3+c  and  then  to  use  the  mini- 
mum of  this  quadratic  as  the  optimum  value  of  £.   While 
there  is  no  reason  to  believe  that  this  should  be  as  accu- 
rate as  a  search  procedure,  it  requires  far  fewer  experi- 
ments to  achieve  an  estimate. 

Given  a  quadratic,  f(x),  and  three  points  x,  <  x?  <x_, 
it  can  be  shown  that  if  f(x,)  >  f(x2)  and  f(x_)  >  f(x„), 

then  x,  <  x   <  x_  where  f (x  )  =  min  f (x) . 

x 
A  quadratic  may  be  fitted  exactly  to  any  three  points 

x..,x2,  and  x_  in  the  following  manner. 

2 
Let  f(x)  =  ax  +bx+c  where  a,  b,  and  c  are  unknown  con- 
stants . 
Using  the  values  of  f (x)  at  the  three  known  points 

the  following  equations  may  be  written: 

2 
f(x, )  =  ax.  +bx,+c 

2 
f(x2)  =  ax2  +bx2+c 

2 
f(x  )  =  ax..  +bx_+c 
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or  in  matrix  notation 


x. 


X  2 

X2 

x  2 

X3 


fl 

f2  " 
f3 


-  ** 

xl 

1 

a 

x2 

1 

b 

X3 

1 

c 

_  _ 

The  expression  is  easily  solved  for  the  vector  of 
coefficients. 


x  2 

x3 


The  matrix 


x. 


x. 


-1 

—    — 

xl 

1 

fi 

x2 

1 

f2 

X3 

1 

f3 

_           _ 

X. 


X. 


x. 


X, 


X. 


is  positive  definite  and 


therefore  its  inverse  always  exists.   Thus  the  co- 
efficients a,  b,  and  c  can  always  be  computed. 

The  minimum  of  this  quadratic  is  found  by  setting  the 

* 

derivative  equal  to  zero.   This  results  in  x   =  -b/2a. 

The  three  test  points  through  which  the  quadratic  is 
to  be  fitted  are  again  found  rather  crudely.   The  first 
point  is  placed  at  the  origin  of  the  search.   No  experi- 
ment need  be  performed  here  as  the  value  of  the  perform- 
ance measure  is  already  known.   A  test  is  next  performed 
with  a  step  size  of  unit  length  and  the  results  of  this 
test  determine  the  direction  of  further  search  along  the 
gradient. 
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1.  If  the  performance  measure  at  this  point  is  great- 
er than  that  at  the  origin  the  third  test  point 

is  placed  midway  between  the  preceding  test  and 
the  origin.   Successive  tests  are  made  in  this 
manner  until  the  value  of  the  performance  measure 
is  less  than  that  found  at  the  origin,  and  final- 
ly, the  quadratic  form  is  fitted  to  the  three 
points  closest  to  the  origin. 

2.  If  the  performance  measure  at  the  first  test 
point  is  less  than  that  at  the  origin,  the  step 
size  is  doubled  and  further  tests  are  placed  in 
this  manner  until  the  performance  measure  increas- 
es in  value.   The  quadratic  is  then  fitted  through 
the  three  test  points  that  are  farthest  from  the 
origin. 

Once  the  three  points  have  been  determined,  the  con- 
stants a  and  b  are  easily  evaluated  and  the  quadratic  ap- 
proximation  of  j8  generated. 
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6.   AN  ANALYTIC  EXAMPLE 

A  simple  example  will  serve  to  illustrate  some  of 
the  differences  between  the  standard  gradient  method  and 
the  method  utilizing  the  modified  gradient  computation. 

Consider  the  linear  first  order  system: 

x(t)  =  -x(t)  +  u(t) 
and  the  performance  measure : 


J  =  x2(tf)  = 


f*f    2 

l/,u  (t)  dt 

t   ^ 
o 


The  following  additional  data  is  given. 

t   =  0,  tf  =  1,  x(tQ)  =  xn  =  4-°'  x(tf)  unspecified 

Forming  J   and  taking  the  variation,  the  necessary 
a 

conditions  can  be  written. 

1.  x  (t)  =  -x  (t)  +  u  (t)   with  B.C.  x  (tQ)  =  xQ 

2.  p*(t)  =  p*(t) 

3.  p*(t^  =  2x*(tf) 

4.  u*(t)  =  -p*(t) 

These  equations  may  be  solved  simultaneously  for  the  opti- 
mal solution. 

(t)  =  x0e_t-k/2  [e^e^] 

p*(t)  =  kefc,  p*(l)  =  2x*(l) 

u  (t)  =  -ke11 


* 
x 
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with  x      =   4.0 

k   =   0.58063,    u*(t)    =    -O.SSOeSe11 

x  (1)  =  0.78915,   J*  =  1.16126214 
This  will  be  considered  as  the  reference  solution. 

To  exaggerate  the  difference  between  the  two  methods, 
the  best  control  in  the  set  CL    will  be  calculated.   With 
this  choice  of  control,  the  following  relationships  are 

found. 

— t      — t 
x(t)  =  xne   +u(l-e   ) 

p(t)  =  ket,  k  =  2e"tfx(tf) 

VuK(x(t),u(t),p(t),t)  =  u+ket 

The  procedure  is  started  by  selecting  an  initial  con- 
trol history  and  choosing  the  next  control  according  to 
the  method  selected.   If  the  integration  is  done  numerical- 
ly on  a  digital  computer  with  a  step  size  of  1.0,  the  only 
values  of  x  and  p  available  are  those  at  the  end  points  of 
the  interval.   The  initial  guess  of  u(t)  will  be  0.0. 
Case  I.   Standard  gradient  procedures 

6u  is  chosen  to  satisfy  equation  (7)  at  discrete  points  in 
the  interval.   For  an  extreme  case  6u  is  based  on  condi- 


tions existing  at  t  =  0. 

Then  6u(l)  =  -u(l)-k 

.    (i+1)     (i)  -  (i) 
and   u     =   u    +ou 
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Applying  this  to  the  problem  equations: 

x(l)  =  x0e-1+u(:L)  (1-e-1) 

k  =  2e~1x(l) 

u  x    =  -k 

Elimination  of  variables  results  in  a  difference  equation 
in  u. 

u(l+1)=  -0.271x0-0.469u(l) 

* 
which  converges  to  u   =  -0.1845x  . 

Application  of  this  control  to  the  system  results  in 
u*  =  -0.739,   x*(l)  =  1.006,   J*  =  1.285 


note  that:     u(t)  +  p(t) 


t=Q=  -0.739+0.739  =  0 


Apparently  the  "gradient"  is  zero  and  the  optimal  control 
has  been  found. 

However,  a  direct  search  among  controls  in  the  set  0L 
results  in 

u*  =  -1.034,   x*(l)  =  0.817,   J*  =  1.203 
obviously  a  better  choice  of  control  than  that  generated 
by  the  standard  gradient  method. 
Case  II .   Modified  gradient  method 
6u  is  chosen  to  satisfy 


1   ■  -c 
tf-t~    [u(t)+p(t) ]dt+6u(l)  =  0 


which  is  the  integral  expression  in  equation  (15) . 
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*£, 


Then  6u(l)  =  -~    [u (l) (t) +p (l) (t) ]dt 


t 
o 


where  At  =  t^-t 
f  o 


=  "  Sfe  [u^At+kfe1-!)] 


and 


(i+1)     k  ,  1  . . 
u    =  "  SE(e  -x) 

After  application  to  the  problem  equation  set  and 
manipulation,  a  new  difference  equation  in  u  is  obtained. 

u(l+1)=  -0.465x0-0.8u(l) 

with  solution 

u   =  -0.258xQ 

Application  of  this  control  to  the  system  results  in 

u*  =  -1.034,   x*(l)  =  0.818,   J*  =  1.203 
This  is  virtually  the  same  result  as  found  by  direct 
search. 

Of  the  choices  offered  by  the  two  methods,  the  modi- 
fied gradient  (which  is  scalar  for  this  problem)  converged 
to  the  optimum  solution  for  this  type  of  control (u(t) cH,  ) . 
The  control  gradient  for  this  problem  is  sketched  in 
Figure  3.  which  shows  conditions  existing  after  the  op- 
timum control  has  been  found.   Examination  of  this  figure 
reveals  that  this  is  not  the  optimum  solution  when  all 
admissible  choices  of  control  are  considered.   A  suitable 
choice  for  an  admissible  6u(t)  is  shown  in  the  figure. 
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Figure  3 

Linear  Regulator  -   v„K(t)  vs.  time  at  optimal 
solution  for  class  O,  control.  An  admissible 
control  variation  is  shown. 
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The  resulting  6J  is  0.256e. 

What  we  have  found  is_  the  best  control  in  the  set  CL    since, 

u*(t)+  p*(t)  dt  =  -1.034  +  .602  [e1-!]  =  0 


and  thus  the  variation  in  J  is  zero  for  small  changes  in 
this  class  of  control. 

The  method  is  simply  extended  to  other  classes  of  con- 
trol.  If  the  control  is  selected  from  the  set  Ci   . 


6j  = 


t  +At 

Or 


V  K(x(t),u(t),p(t),t)   6u(t)dt 

L-  u 


tn+At 

■■-r  i 

VuK(x(t) ,u(t),p(t),t)J6u(t)dt 
^1 

(tf"to) 
here   At  =  — ^ — —it.  ,  =  t.+At,  j  =  0,1 

t  =0 
o 

6u(t)  c[02] 

t.  +  At 

i  Mr  i 

and     6Uj  =  -  -%;   J   |_Vu3C(x(t),u(t),p(t),t)Jdt,   j  =  0,1 

j 

When  applied  to  the  example  problem,  a  set  of  differ- 
ence equations  is  obtained. 

u^1"^  =  -0.352xQ  -0.2811^^  -0.385u2^l) 

u2(l+1)  =  -0.583xQ  -0.360u1(l)  -0.642u2(l) 

The  solution  of  these  equations  is : 

u*  =  -0.757,  u2   =  -1.255,   for  xQ  =  4.0 
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When  these  controls  are  applied  to  the  regulator  the  re- 
sults are: 

x*(l)  =  0.795,  and  J*  =  1.169 

The  optimal  solution  presented  above  compares  closely  to 
the  solution  attained  by  a  direct  search  of  controls  in 
the  set  O- . 
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7.   COMPUTER  PROGRAM  AND  NUMERICAL  RESULTS 
An  explanation  of  the  computational  program  is  pre- 
sented.  The  section  concludes  with  the  results  of  problems 
computed  using  the  standard  and  modified  gradients. 
7 . 1   The  Gradient  Program 

A  flow  diagram  of  the  gradient  algorithm  was  presented 
in  section  2.   This  formulation  was  followed  in  coding  the 
computer  programs  following.   The  main  program  accomplishes 
all  "bookkeeping"  operations  required  and  contains  logic 
statements  for  problem  termination.   Various  subroutines 
are  called  to  perform  computations  and  other  auxiliary  func- 
tions.  The  interconnection  of  these  programs  is  shown  in 
Figure  4. 

The  principal  subsections  of  the  main  program  and  the 
subroutines  are  as  follows : 

A.  The  input  section 

The  following  inputs  are  required. 

1.  The  state  initial  condition  vector  -x 

— o 

2.  The  initial  and  final  times 

3.  The  allowable  integration  step  size  for 
numerical  accuracy 

4.  The  number  of  control  intervals  in  the 
problem 

5.  The  convergence  criterion 

B.  Problem  initialization 

This  section  divides  the  problem  time  into  the 
desired  number  of  equal  control  intervals  and 
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compares  the  length  of  an  interval  with  the  al- 
lowable integration  step  size.   If  the  interval 
length  is  too  great,  it  is  divided  into  subinter- 
vals  until  the  allowable  integration  step  is 
greater  than,  or  equal  to,  a  subinterval  length. 
The  integration  step  size  is  then  made  equal  to  a 
subinterval  and  the  control  during  an  interval  is 
made  up  of  identical  controls  in  contained  subin- 
tervals.   This  maintains  integration  accuracy 
while  allowing  flexibility  in  the  number  of  con- 
trol intervals  selected.   Other  values  which  need 
be  calculated  only  once  for  a  particular  problem 
are  generated  at  this  time  and  stored  for  future 
use. 

C.  Evaluation  of  performance  index  and  calculation 
of  the  gradient  vector 

This  process  is  achieved  by  calling  an  integration 
subroutine  with  a  control  history  u(t). 

D.  Termination  decision 

Problems  are  terminated  when  the  norm  of  the  gra- 
dient vector  becomes  less  than  the  preset  conver- 
gence criterion.   This  norm  is  taken  to  be  of  the 
form 

L-l  pt.+At  2 

l|yuK"="L    I  [it]      vuK(2£(t),u(t),p_(t),t)dt] 
i=0  fci 

where  L  is  the  number  of  control  intervals. 
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An  alternate  exit  is  made  after  a  preset  number 
of  iterations. 

E.  Search  for  improved  control 

This  search  is  performed  by  calling  a  subroutine 
of  the  desired  search  algorithm.   This  section 
returns  to  section  C. 

F.  Output  variables  of  interest 

The  control  history  and  up  to  four  state  variable 
trajectories  are  tabulated  and  graphs  of  desired 
variables  are  plotted.   This  completes  the  compu- 
tation. 

G.  Integration  subroutine 

This  routine  integrates  the  state  equations  and 
integral  terms  of  the  performance  measure  forward 
in  time  when  called  with  a  given  choice  of  con- 
trol.  At  the  final  time  non  integral  terms  of 
the  performance  measure  are  evaluated  and  the 
cost  (J(u/   )  is  calculated.   An  exit  is  provided 
at  this  point  when  a  new  gradient  is  not  to  be 
computed.   If  the  gradient  is  to  be  computed,  the 
costate  boundary  conditions  are  evaluated  (£(t^.)), 
and  the  states,  costates,  and  the  gradient  equa- 
tion are  integrated  backwards  in  time.   During 
the  integration,  the  integral  mean  value  of  the 
gradient  is  computed  for  each  control  interval 
and  stored.   When  the  initial  time  is  reached, the 
gradient  generator  is  called  with  the  array  of 
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mean  values  and  exit  is  made  to  the  calling  pro- 
gram. 

H.   Search  subroutine 

This  routine  computes  a  new  trial  control  u     (t) 
by  placing  experiments  at  a  distance  j9  along  the 
gradient  vector.   An  experimental  control  is  gen- 
erated by  choosing  a  value  of  0  and  calling  the 
control  generator.   An  experiment  is  made  by  cal- 
ling the  integration  subroutine  with  this  experi- 
mental control  and  evaluating  the  performance 
measure.   The  best  distance  (j3  )  is  computed  ac- 
cording to  whichever  algorithm  (golden  section 
search  or  quadratic  approximation)  is  selected. 
The  new  control  is  calculated  by  the  control  gen- 
erator and  exit  is  made  to  the  calling  program. 

I.   Gradient  generator 

When  called  with  a  vector  of  parameters  whose  di- 
mension is  equal  to  the  number  of  control  inter- 
vals, this  subroutine  generates  a  new  vector 
whose  dimension  is  the  number  of  integration  steps 
in  the  problem.   All  values  of  the  newly  generated 
vector  in  a   control  interval  have  the  same  value 
as  the  single  parameter  associated  with  that  in- 
terval. 

J.   Control  generator 

This  routine  is  called  with  a  reference  control,  a 
reference  gradient  and  a  scalar  multiplier.   The 
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new  control  generated  is  the  sum  of  the  reference 
control  and  the  product  of  the  scalar  and  the 
gradient. 
This  same  program  is  modified  to  compute  a  solution 
according  to  the  standard  gradient  technique  by  evaluating 
the  gradient  equation  at  the  beginning  of  each  control  in- 
terval and  using  this  vector  as  the  gradient. 
7.2   A  Linear  Regulator 

The  program  was  initially  applied  to  the  linear  regu- 
lator problem  of  section  6.   An  integration  step  size  of 
0.01  and  an  initial  guess  of  control,  u    (t)  =  0.0  were 

used.   The  control  history  was  selected  from  ^-inn.   A  gold- 

-5 
en  section  search  with  a  convergence  interval  of  1x10   was 

employed  to  ensure  accuracy  in  determining  the  best  step 

sixe.   The  convergence  criterion  for  the  gradient  norm  was 

lxl0~6. 

Both  methods  converged  on  the  second  iteration  to  a 
value  of  1.16126633  for  the  performance  measure. 
Results  of  the  problems  are  shown  in  Table  2. 

The  fortunate  initial  guess  of  control  and  the  struc- 
ture of  the  problem  enable  both  methods  to  achieve  conver- 
gence after  only  two  iterations.   Of  interest  is  the  pri- 
mary indicator  of  convergence,  the  norm  of  the  gradient 
vector,  which  has  indicated  convergence  with  the  modified 
method  but  has  failed  to  achieve  even  the  same  order  of 
magnitude  using  standard  techniques.   Both  of  these  compu- 
tations incorporated  an  exit  when  the  difference  between 
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Iteration 

Coat 

Gradient 
Nor:.? 

Gradient 
Angular 
Chmnga 

A* 

Standard   Gradient 

0 

2.166:36427 

370.7307 1639 

86.8° 

0.638821 

1 

1.36126633 

0.002117  87 

17  9.9° 

0.072949 

2 

1.161266,33 

0.00158204 

-- 

— - 

Modified   Gradient 

0 

2.16636427 

374.45974144 

86.8° 

0.636446 

1 

.   1. 163 26633 

0.00003283 

179.9° 

0.499995 

2 

1.16126633 

0.00000015 

-- 

-- 

T-.ble    2 
Linof.r  Rerul-tor    -   Computation  History 
u<°)(t)    z    0.0 
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successive  costs  was  less  than  lxlO~  . 

Another  computation  was  made  with  this  cost  exit  re- 
moved and  an  initial  guess  of  -1.0  for  the  control.   A  max- 
imum of  20  iterations  was  permitted.   The  results  of  this 
test  are  shown  in  Table  3. 

While  the  initial  guess  of  -1.0  for  control  is  closer 
to  being  optimal  than  the  guess  of  0.0  used  previously, 
both  methods  are  in  a  poorer  starting  position  computation- 
ally.  The  standard  gradient  technique  fails  to  achieve 
convergence  after  20  iterations  (the  maximum  allowed) .  Of 
primary  interest  are  the  gradient  angular  change  and  the 
step  size,  ($   .   While  the  standard  gradient  is  roughly 
orthogonal  initially,  as  the  optimum  is  approached  succes- 
sive gradients  are  very  nearly  identical;   moreover,  the 
step  size  drops  to  the  minimum  value  the  search  algorithm 
can  generate.   A  gradient  supposedly  exists  but  search 
along  this  gradient  fails  to  disclose  a  better  control. 

The  modified  gradient  computation,  on  the  other  hand, 
indicates  a  rather  smooth  convergence  to  the  gradient  norm 
cutoff  and  the  orthogonality  of  successive  gradients  is 
maintained  throughout  the  computation.  The  step  size,  P  , 
exhibits  a  tendency  to  form  a  pattern  and,  more  important- 
ly, improves  the  control  at  each  iteration.  It  is  conclu- 
ded, therefore,  that  the  method  has  calculated  the  gra- 
dient accurately. 

The  optimal  value  of  the  performance  measure  generated 
by  the  modified  method  compares  favorably  with  the  true 
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Iters tion 

Cost 

Gradient 
Norm 

Gradient 
"Annuls r 
'•'Change 

£* 

Stnndnrd   Gr 

adleiit 

0 

1,20468767 

9.45135627 

97.1° 

0.860359 

1 

1.16360626 

0.82419079 

91.4° 

O.o66713 

2 

1.16142950 

0.02993304 

74.2° 

1.026390 

3      . 

1.16126862 

0.00065597 

109.0° 

0.. 000006 

4 

1.16126862 

0.00065596 

0.1° 

0.000005 

5 

• 

.     1.16126862 

• 

0.00065594 

• 

0.1° 

• 

0.00.'.  005 

• 

• 
• 

20 

• 
• 

1.16126  862 

• 
0.00066577 

• 
• 

• 
• 

Modified   G 

r?dl?rit 

0 

1.20468767 

9.60531177 

97.0° 

0.845526 

1 

1. 163 97 y 91 

0.8o556082 

90.0°^ 

0.694645 

2 

1.16143656 

0.03768012 

90.0° 

0.845523 

rt 

1.16127727 

0.00336629 

90.0° 

0.594538 

4 

1.16126729 

0.00014781 

90.0° 

0.845523 

5 

1.16126667 

0.00001316 

90.0° 

0.59447  8 

6 

1.16126663 

0.00000068 

-- 

-- 

Table   3 


Linear  RerulMor   -   Coniout«tion  History 
(0) 


u 


(t)    ~    -1.0 
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optimal  value  of  1.16126214  computed  by  digital  evaluation 
of  the  analytic  solution  of  the  problem.   The  error  arises 
from  the  piecewise  constant  control  restriction. 
7. 3   Continuous  Stirred  Tank  Reactor 

A  more  ambitious  undertaking  is  the  solution  of  the 
continuous  stirred  tank  reactor  problem  posed  by  Lapidus 
and  Luus 

The  state  equations  of  the  system  are: 

,25x  (tK 
xx(t)  =  -2(x1(t)+0.25)+(x2(t)+0.5)exp  (^  {^)+2J 


-(x1(t)+0.25)u(t) 


.25X,  (tK 
x2(t)  =  0.5  -  x2(t)  -  (x2(t)+0.5)exp  (^^ +2) 

and  the  performance  measure  to  be  minimized  is : 
[x12(t)+   x22(t)  +  0.1u2(t)]  dt 


^fr 


J  = 


Forming  the  augmented  performance  measure  and  utiliz- 
ing the  necessary  conditions  results  in  the  formulation  of 
the  costate  equations  and  gradient  equation. 


px(t)  =  -2x1(t)+2p1(t) 


-Pl(t) (x2(t)+0.5) 


50 


L(x1(t)+2)-J 
+Pl(t)u(t)+p2(t)  (x2(t  )  +  0.  5) 


.25x1(t). 
expU  (t)+2 


50 


L(x1(t)+2)' 


(t) 

.25x1(t). 
exPWt)+2  ) 
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.25x  (t.K       f      .25x  (IK-, 
p2(t)=  _2x2(t)-p1(t)exp(^-liriT>  p2(t)[_l+exp^(t)+2jj 

Px(tf)  =  p2(tf)  -  0 

VuK(x(t),u(t),p(t),t)  =  0.2u(t)-p1(t)[x1(t)+  0.25_ 

The  other  given  data  are : 
tQ  =  0.0,  tf  =  0.78 

x, „  =  0.05,  x20  =  0.0,  x(tf)   unspecified 

The  control  to  be  generated  is  the  flow  of  coolant 
through  a  coil  immersed  in  the  reactor.   x, (t)  is  the  de- 
viation from  the  desired  steady  state  temperature  and 
Xp(t)  is  the  deviation  from  the  desired  steady  state  che- 
mical concentration  in  the  reactor. 

Solution  of  this  problem  using  variation  of  extremals 
indicates  the  optimum  value  of  the  performance  measure  to 
be  .02660336.   As  in  the  linear  regulator  problem,  the  ini' 
tial  investigation  is  a  comparison  between  the  standard 
gradient  method  and  the  modified  technique.   The  step  size 
is  determined  by  a  golden  section  search. 

Three  exits  were  provided 

-7 

1.  Norm  of  gradient  less  than  1x10 

2.  Improvement  between  successive  iterations  less 
than  1x10 "6 

3.  20  iterations  have  been  completed 

The  integration  step  size  selected  was  0.01  and  78 

(0) 
control  intervals  were  provided.   uv   (t)  was  chosen  as 
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0.0.   The  results  of  the  investigation  are  shown  in  Table  4. 

From  this  data,  the  principal  advantage  of  the  modi- 
fied gradient  method  is  the  orthogonality  maintained  by 
successive  gradients,  reinforcing  the  earlier  conclusion 
that  this  method  is  computing  an  improved  control  gradient. 
Both  methods  terminated  when  the  difference  between  succes- 
sive evaluations  of  the  performance  measure  was  sufficient- 
ly small,  but  the  standard  gradient  terminated  more  slowly 
and  to  a  greater  value. 

With  this  exit  removed,  the  modified  gradient  conver- 
ged to  the  desired  gradient  norm  on  the  ninth  iteration. 
The  value  of  J  computed  was  0.2660936.   Successive  gradi- 
ents  maintained  the  orthogonality  property  throughout  the 
process. 

Under  the  same  conditions,  the  standard  gradient  meth- 
od failed  to  converge  after  20  iterations.   Successive  gra- 
dients became  less  and  less  orthogonal  as  the  optimum  was 
approached  and  the  algorithm  began  to  break  down  after  16 
iterations.   The  minimum  value  of  J  computed  was  0.0266120L 

Some  further  insight  into  the  convergence  tendencies 
of  the  two  methods  is  gained  by  graphically  comparing  the 
logarithms  of  the  gradient  norms  and  the  normalized  errors 

in  the  performance  measure.   This  latter  quantity  is  gener- 

(i)    j(i) -  J  * 

ated  as  e     =  * ,  where  J   is  the  value  computed 

J 

using  variation  of  extremals.   These  results  are  shown  in 
Figures  5  and  6  respectively. 

The  modified  gradient  technique  converges  smoothly  to 


47 


Iteration 

Cost 

Log1Q  . 
Gradient 

Norm 

Gradient 
Annuls r 
Change 

$* 

Stands rS  Gt 

ad±ent 

0 

0.*  22083207 

-1.03914642 

106.4° 

3  .623640 

1 

0.02729169 

-3.45231247 

81.3° 

1.956198 

2 

0.026975.32 

-2.93352318 

88.9° 

0.302558 

3 

0. 02677792 

~3.9o951030 

104.8° 

0.656046 

4 

0.0267  4786 

-3.80313587 

83 . 3° 

0.4037  86 

5 

0.02670966 

-3.95664883 

102.2° 

0.295981 

6 

0.02669719 

-4.31466280 

72.9° 

1.216354 

7 

0.02666300 

-3.71768951 

94.6° 

0.190202 

8 

0.02664876 

-4.68956352 

o5.3° 

1 .5665  06 

9 

0.02663578 

-4.23  825169 

84.2° 

0.543573 

10 

0.02662267 

-4.6981;  728 

124.2° 

0.155175 

11 

0.02662192 

-5.14793015 

-- 

— 

Modified   G 

radient 

0 

0.22? 8? 207 

-1.05710602 

106.4° 

1.6o7650 

1 

0.02729702 

-3.48081589 

90.0° 

3.600311 

2 

0.02672917 

-3.12831783 

90.0° 

0.256390 

3 

0.02662924 

-5.04592037 

90.0° 

3.58650C . 

"       4 

0.02061315 . 

-4.58679676 

90.0° 

0.2293  26 

5 

0.02661015 

-6.45678598 

90.0° 

3.644996 

6 

. 0.02660963 

-5.92702679 

— 

Table  4 
Stirred  Te.nk  Reactor  -  Commutation  Historv 
u(0)(t)  «  0.0 
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Modified  gradient 


Standard  gradient  — 


Figure  6 
Stirred  Tank  R&actor  -  Log^Q  Grr.dient  l,or::i  vs.  Iterations 
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Modified  gradient 
Standard  gradient 


8    S   10    11 


Figure  6 
Stirred  T*nk  Reactor  -  Percent. Cost  Error  vs.  Iterations 
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the  optimum  value  for  the  class  of  control  while  the  stand- 
ard technique  behaves  somewhat  erratically  and  converges 
more  slowly  once  values  "near"  the  optimum  are  attained. 

The  logarithm  of  the  gradient  norm  exhibits  a  defin- 
ite pattern  of  convergence  with  the  modified  gradient  and 
is  again  rather  erratic  under  the  standard  method.   In  a 
further  computation  using  the  modified  gradient  technique, 

the  pattern  displayed  in  Figure  5  was  maintained  up  to  a 

-13 
convergence  criterion  of  1x10    .   This  computation  con- 
verged after  19  iterations  although  no  measurable  improve- 
ment in  the  value  of  the  performance  measure  was  attained 
beyond  the  ninth  iteration  value  of  0.02660936. 

All  of  the  above  investigations  involved  golden  sec- 
tion searches  in  the  determination  of  the  correct  step 
size  and,   while  convergence  using  the  modified  gradient 
method  is  achieved  in  relatively  few  iterations,  the  number 
of  experiments  performed  in  the  search  is  rather  high.   A 
total  of  195  experiments  were  performed  by  the  modified 
gradient  technique  in  computing  the  solution  of  the  pre- 
sently considered  problem. 

The  next  investigation  utilized  the  quadratic  approx- 
imation method  to  determine  a  step  size  which  gives  effi- 
cient improvement  with  relatively  few  experiments.   While 
the  number  of  iterations  may  be  expected  to  increase,  it 
is  expected  that  the  total  number  of  experiments  will  be 
decreased. 

A  comparison  between  this  method  and  a  golden  section 
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Quadratic  approximation 
Golden   section   search 


20    40    60   80    100   120   140   160   180   200 
Experiments •» 

Figure  7 

Stirred  Tank  Reactor  -  Percent  Cost  Error  vs . .Exneriments 
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Quadratic  approxi:nn  tion 
Golden  section   search 


Figure  8 
Stirred  Tank  Reector  -  percent  Cost  Srror  Va.  Iterations 


53 


search  is  shown  in  Figures  7  and  8. 

The  number  of  experiments  made  in  the  computation  is^ 
markedly  reduced  (to  57)  by  using  the  quadratic  approxima- 
tion, and,  while  poorer  initially  than  the  golden  section, 
compares  quite  favorably  with  that  method  as  the  optimum 
is  approached.   Successive  gradients  using  the  quadratic 
approximation  behave  erratically  during  the  initial  iter- 
ations, but  become  nearly  orthogonal  as  the  optimum  is  ap- 
proached.  Computation  time  is  sharply  decreased  as  expec- 
ted from  the  reduced  number  of  experiments  required  when 
using  the  quadratic  approximation. 
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8.   CONCLUSIONS 

The  problem  of  computing  a  control  which  minimizes  a 
selected  performance  measure  has  been  stated  and  a  number 
of  proposed  methods  for  solution  have  been  presented. 

Motivation  for  the   use  of  the  gradient  algorithm  has 
been  discussed  and  a  method  of  improving  the  convergence 
of  this  procedure  utilizing  integral  mean  values  of  the 
gradient  equation  has  been  derived.   Finally,  an  improved 
method  of  determining  an  approximate  gradient  step  size  has 
been  presented. 

The  selection  of: 

t  .+At 

6UJ  =  Kjf  ]k%(t),u(t)(p(t);t)]dt 
t. 

: 

is  similar  to  the  choice  made  by  Hasdorff  and  Gupta 
who  applied  the  method  to  the  solution  of  sampled  data 
systems. 

Further  reseach  is  required  in  the  evaluation  of: 

t  .+At 

Jt  F  (x,u,...,u  ,p)=^J  :[vuK(x(t),u(t),p(t),t)]dt 
J  J  J  fcj 

which  may  prove  useful  in  determining  a  more  accurate  value 
for  5u . .   Efforts  in  this  direction  are  felt  likely  to 
prove  fruitful  in  achieving  further  improvements  in  the 
technique. 


55 


REFERENCES 

1.  Bellman,  R.  E.,  Dynamic  Programming,  Princeton  Univer- 
sity Press,  Princeton,  New  Jersey,  1957. 

2.  Bellman,  R.  E.  and  Dreyfus,  S.  E.,  Applied  Dynamic  Pro- 
gramming, Princeton  University  Press,  Princeton,  New 
Jersey,  1962. 

3.  Kirk,  D.  E.,  Optimal  Control  Theory;  An  Introduction, 
Ch.  5,  to  be  published. 

4.  Kelley,  H.  J.,  "Gradient  Theory  of  Optimal  Flight 
Paths",  Journal  of  the  American  Rocket  Society,  Vol. 
30,  pp.  947-953,  October,  1960. 

5.  Kelley,  H.  J.,  Kopp,  R.  E.  and  Moyer,  H.  G. ,  "A  Tra- 
jectory Optimization  Technique  Based  Upon  the  Theory 
of  the  Second  Variation",  Progress  in  Astronautics  and 
Aeronautics,  Vol.  14,  1964. 

6.  Bryson,  A.  E.  and  Denham,  W.  F. ,  "A  Steepest-Ascent 
Method  for  Solving  Optimum  Programming  Problems", 
Journal  of  Applied  Mechanics,  Vol.  29,  pp.  247-257, 
June,  1962. 

7.  Breakwell,  J.  V.,  "The  Optimization  of  Trajectories", 
Journal  of  the  Society  of  Industrial  and  Applied  Math- 
ematics, Vol.  7,  pp.  215-247,  June,  1959. 

8.  McGi.il,  R.  and  Kenneth,  P.,  "Solution  of  Variational 
Problems  by  Means  of  a  Generalized  Newton-Raphson 
Operator",  Journal  of  the  American  Institute  of  Aero- 
nautics and  Astronautics,  Vol.  2,  pp.  1761-1766,  Octo- 
ber, 1964. 

9.  Wilde,  D.  J.,  Optimum  Seeking  Methods,  pp.  32-35, 
Prentice-Hall,  Inc.,  Englewood  Cliffs,  New  Jersey, 1964 

10.  Lapidus,  L.  and  Luus,  R. ,  "The  Control  of  Nonlinear 
Systems",  Journal  of  the  American  Institute  of  Chemi- 
cal Engineers,  Vol.  13,  pp.  108-113,  January,  1967. 

11.  Hasdorff,  L.  and  Gupta,  S.C.,  "An  Iterative  Procedure 
for  Optimal  Control  of  a  System  by  Sampled  Input", 
Journal  of  Electronics  and  Control,  Vol.  16,  pp. 
177-188,  February,  1964. 


56 


INITIAL  DISTRIBUTION  LIST 


No. Copies 


1.  Defense  Documentation  Center  20 
Cameron  Station 

Alexandria,  Virginia  22314 

2.  Library  2 
Naval  Postgraduate  School 

Monterey,  California  93940 

3.  Commander-Ships  Systems  Command 
Department  of  the  Navy 
Washington,  D.C.  20350 

4.  Commander-Air  Systems  Command  (Code  AIR-09B)    1 
Department  of  the  Navy 

Washington,  D.C.  20360 

5.  Professor  Donald  E.  Kirk  1 
Department  of  Electrical  Engineering 

Naval  Postgraduate  School 
Monterey,  California  93940 

6.  Professor  Harold  A.  Titus  1 
Department  of  Electrical  Engineering 

Naval  Postgraduate  School 
Monterey,  California  93940 

7.  LCDR  John  A.  Battenburg  1 
6683  Maury  Drive 

San  Diego,  California  92119 


57 


UNCLASSIFIED 


Security  Classification 


DOCUMENT  CONTROL  DATA  -R&D 

{Security  classification  of  title,    body  of  abstract  and  indexing  annotation  must  be  entered  when   the  overall  report  Is  c  la  t  silled 


I,   originating   AC  Tl  VI  TY  (Corporate  author) 

Naval  Postgraduate  School 
Monterey,  California  93940 


2a.   REPORT    SECURITY    C  L  A  SSI  Fl  C  A  T  I  O* 

UNCLASSIFIED 


2b.    GROUP 


3.    REPORT    TITLE 


An  Improved  Gradient  Algorithm  for  the  Solution  of  Two-Point  Boundary- 
Value  Problems. 


4.   DESCRIPTIVE  NOTES  (Type  of  raport  and.tnclusive  dates) 


E.  E.  Thesis  -  Electrical  Engineering  -  1968 


5-   AUTHOR(S)  (First  name,  middle  initial,  last  name) 


John  Allen  Battenburg 

Lieutenant  Commander,  United  States  Navy 


6.    REPORT    DATE 

June    1968 


7a.    TOTAL    NO.    OF    PAGES 


59 


76.    NO      OF    RE  FS 


11 


8a.    CONTRACT    OR    GRANT   NO. 


b.    PROJECT  NO. 


9a.    ORIGINATOR'S    REPORT    NUMBER(S) 


9b.   OTHER   REPORT   NOISI  (Any  other  numbers   that  may  be  assigned 
this  report) 


tO.    DISTRIBUTION   STATEMENT 


m 


It.    SUPPLEMENTARY    NOTES 


12.    SPONSORING   MILI  TARY    ACTIVITY 


Naval  Postgraduate  School 
Monterey,  California  93940 


13.  ABSTRACT 


The  gradient  method  for  solving  two-point  boundary-value  pro- 
blems is  discussed  and  a  modification  of  the  gradient  direction  is 
proposed.   An  algorithm  for  efficiently  determining  the  step  size 
is  also  derived.   Analytic  and  numerical  examples  illustrating  the 
efficiency  of  the  method  are  presented. 


DD 


">«"  1473 

I  NOV  it  I  *t  /  w 

S/N   0101-807-681 1 


(PAGE    1) 


UNCLASSIFIED 


Security  Classification 


a- 3 140  8 


59 


UNCLASSIFIED 

Security  Classification 


KEY     WO  R  OS 


Optimization 

Computation 

Gradient 

Variational  Calculus 

Steepest  Ascent 

Search 

Data   Processing 

Two-Point  Boundary-Value   Problems 

Control  Systems 


LINK     A 


« 


1 


DD  ,?o?..1473  <back, 


S/N    010I-JQ7-6IJ2I 


UNCLASSIFIED 


Security  Classification 


fin 


